5.2 Cluster Segmentation 5-61
Exercises
1. Bank Direct Marketing Cluster Analysis
Given the inputs, do there exist clusters of customers in the bank direct marketing data set? This
exercise explores the bank direct marketing data and tries to profile the resulting clusters.
a. Create a new diagram in your project. Name the diagram Bank Clustering.
b. Use the bank__direct__marketing data as a data source for this clustering and profiling exercise.
c. Determine whether the model roles and measurement levels assigned to the variables are
appropriate.
Examine the distributions of the following variables:
• balance
• day
• previous
• duration
• Age
• campaign
The three most heavily skewed distributions are those of balance, campaign, and previous. Although
a log transformation is not optimal, it can reduce the skewness of these distributions.
d. Drag a Transform Variables node onto the diagram and connect it to the Input Data source.
e. Apply a log transformation to the following variables:
• balance
• previous
• campaign
f. Connect a Cluster node to the Transform Variables node.
g. Change the Maximum Number of Clusters to 6.
h. Set Use to No for all variables except the following:
• balance
• previous
• duration
• Age
• campaign
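
The effect of the log transformation in step e can be illustrated outside the tool with a minimal Python sketch. The values below are made up to resemble a right-skewed count such as previous and are not taken from the data set. Using log(x + 1) to handle zeros is an assumption made for this illustration and is not necessarily the offset that the Transform Variables node applies.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical right-skewed counts (illustrative only, not from the data set).
previous = [0, 0, 0, 0, 1, 1, 2, 3, 5, 40]

# Assumption: log(x + 1) is used so that zero counts stay defined.
logged = [math.log1p(v) for v in previous]

print(f"skewness before log: {skewness(previous):.2f}")
print(f"skewness after log:  {skewness(logged):.2f}")
```

The transformed values are still right-skewed, only less so, which is consistent with the remark that the log transformation is not optimal.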